Clonal hematopoiesis of indeterminate potential (CHIP) is a precursor state characterized by the expansion of blood cell clones carrying somatic mutations that originate in hematopoietic stem cells (HSCs). Individuals with CHIP have a 10-fold increased risk of developing myeloid malignancies such as myelodysplastic syndromes (MDS) and acute myeloid leukemia (AML). Progression of CHIP to MDS/AML takes several decades, so targeting CHIP clones during this period is a viable strategy to prevent MDS/AML. Despite the importance of CHIP, the mechanisms underlying CHIP predisposition and progression remain poorly understood. Recent genome-wide association studies (GWAS) have identified several genetic loci that are strongly associated with CHIP, but the causal variants remain unknown because most of the associated variants lie in non-coding genomic regions. We have previously shown that modulation of gene regulatory elements in HSCs underlies genetic risk of myeloid malignancies; whether this is also true for a premalignant state such as CHIP is currently unknown.

We hypothesize that germline CHIP risk variants in non-coding loci affect enhancer elements in HSCs. To identify these variants, we used a high-throughput approach, the Massively Parallel Reporter Assay (MPRA), to simultaneously screen for regulatory effects of 1,374 non-coding variants within 24 GWAS loci from a comprehensive study of CHIP risk in ~600,000 individuals. For the MPRA, we synthesized pools of barcoded lentiviral constructs harboring the non-coding variants (risk and non-risk alleles) and surrounding genomic sequence (250 bp) upstream of a minimal promoter driving a reporter gene. Sequencing of the plasmid pool confirmed a uniform distribution of barcodes, with a median of around 50 barcodes per variant. Since enhancers show cell-type-specific activity, we next searched for an appropriate cell type in which to perform the MPRA. Because HSCs, the disease-relevant population in CHIP, are rare, we searched for alternative cell types. We identified MUTZ-3, an AML cell line that maintains a self-renewing CD34+ population and is dependent on HSC transcription factors. Using individual reporter assays, we tested several validated HSPC enhancers, including the PU.1 upstream regulatory element (URE) and the GATA2 inv(3) enhancer, and confirmed that MUTZ-3 cells have trans-acting factors similar to those of primary HSPCs.

We performed our lentiviral MPRA screen using a library of ~73,000 constructs in the CD34+ fraction of MUTZ-3 cells. At 48 hours post-infection, we lysed the cells to extract DNA and RNA and sequenced the barcodes on a NextSeq. The experiments were highly reproducible, and all 5 replicates correlated well with each other at both the DNA and RNA barcode level. When we measured enhancer activity as the ratio of RNA to DNA barcode read counts, positive controls in the pool (PU.1 URE, GATA2 enhancer) showed higher activity than negative ORF controls. The activity of test constructs that overlap hematopoietic ATAC peaks was higher than that of non-overlapping constructs, indicating that the MPRA can identify endogenous enhancers. We identified several hits that showed allele-specific reporter activity, that is, activity that differed between the Ref and Alt alleles of a CHIP variant.
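The activity measure described above, the ratio of RNA to DNA barcode read counts, can be illustrated with a minimal sketch. Everything beyond that ratio is an assumption made for illustration and is not taken from our analysis pipeline: the function names, the library-depth normalization, the pseudocount, the log2 scale, the use of the median across barcodes, and the example counts are all hypothetical.

```python
import math
from statistics import median


def barcode_activity(rna, dna, rna_total, dna_total, pseudocount=1.0):
    """log2 of the RNA/DNA read-count ratio for one barcode.

    Assumptions (not from the abstract): counts are scaled by library
    depth, and a pseudocount guards against zero counts.
    """
    ratio = (rna + pseudocount) / (dna + pseudocount)
    depth_correction = dna_total / rna_total
    return math.log2(ratio * depth_correction)


def allele_activity(counts, rna_total, dna_total, pseudocount=1.0):
    """Median activity over all barcodes tagging one allele.

    `counts` is a list of (rna_reads, dna_reads) pairs, one per barcode.
    """
    return median(
        barcode_activity(r, d, rna_total, dna_total, pseudocount)
        for r, d in counts
    )


def allelic_skew(ref_counts, alt_counts, rna_total, dna_total, pseudocount=1.0):
    """Difference in log2 activity between the Alt and Ref alleles."""
    alt = allele_activity(alt_counts, rna_total, dna_total, pseudocount)
    ref = allele_activity(ref_counts, rna_total, dna_total, pseudocount)
    return alt - ref


# Hypothetical counts for three barcodes per allele (made-up numbers).
ref_counts = [(100, 100), (200, 200), (50, 50)]
alt_counts = [(200, 100), (400, 200), (100, 50)]
skew = allelic_skew(ref_counts, alt_counts, rna_total=1_000_000, dna_total=1_000_000)
```

A positive `skew` would mean the Alt allele drives more reporter expression than the Ref allele in this toy setup. A real analysis would also need replicate-aware statistics, which this sketch does not attempt.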

We prioritized rs17834140, a variant identified by the MPRA, for further validation studies. The variant is located within the 4th intron of the MSI2 gene in a hematopoietic ATAC peak. MPRA and individual reporter assays indicate that the region has enhancer activity and that the risk allele increases MSI2 expression. We further confirmed endogenous enhancer activity by both CRISPR/Cas9 deletions and CRISPR inactivation experiments in human hematopoietic cells. To test whether MSI2 overexpression can also contribute to clonal expansion of CHIP mutant HSCs, we infected LSK cells from TET2 knockout mice and wildtype controls with MSI2-overexpressing and control lentiviral vectors, and observed increased in vitro serial colony replating ability of TET2 knockout cells upon MSI2 upregulation.

Our study paves the way to systematically assess non-coding variants associated with CHIP risk. We expect that the MPRA approach will help uncover key unknown myeloid neoplasm predisposition genes and the regulatory elements through which germline variants influence them.
